Abstract

Motivated by Kleinberg's [6] and subsequent work, we consider the
performance of greedy routing on a directed ring of $n$ nodes augmented
with long-range contacts. In this model, each node $u$ is given an
additional $D_u$ edges, a degree chosen from a specified probability
distribution. Each such edge from $u$ is linked to a random node at distance
$r$ ahead in the ring with probability proportional to $1/r$, a "harmonic"
distance distribution of contacts. Aspnes et al. [1] have shown an
$O(\log^2 n/\ell)$ bound on the expected length of greedy routes in the case
when each node is assigned exactly $\ell$ contacts and, as a consequence of
recent work by Dietzfelbinger and Woelfel [3], this bound is known to be
tight. In this paper, we generalize Aspnes' upper bound to show that any
degree distribution with mean $\ell$ and maximum value $O(\log n)$ has greedy
routes of expected length $O(\log^2 n/\ell)$, implying that any harmonic ring
in this family is asymptotically optimal. Furthermore, for a more general
family of rings, we show that a fixed degree distribution is optimal. More
precisely, if each random contact is chosen at distance $r$ with a
probability that decreases with $r$, then among degree distributions with
mean $\ell$, greedy routing time is smallest when every node is assigned
$\lfloor \ell \rfloor$ or $\lceil \ell \rceil$ contacts.



1 Introduction 

1.1 Background 

Our work extends results that lie at the intersection of mathematically modeling 
the small world phenomenon in social networks and the design of decentralized 
peer-to-peer networks. In both contexts, a central problem is determining how 
efficiently a message can be routed between arbitrary nodes of a network. 

The notion of a small world is most frequently encountered in the context 
of social networks. The term refers to systems where entities are highly clustered 
and linked to only a small portion of the network, but are nevertheless connected 
by short paths. Research, notably the letter-forwarding experiments conducted 
by Stanley Milgram in the 1960s [9], suggests that small world networks exist
in the real world. The work of Kleinberg [6] and others (see Kleinberg [7] for a
review) provides insight into conditions under which people can efficiently find 
short paths using only local information, as modeled by, say, greedy routing. 

Kleinberg's model begins with an n-by-n lattice of nodes. Each node is 
connected to all other nodes within a specified distance. Additionally, each 
node is given $\ell$ long-range contacts (or LRCs) chosen according to some
stochastic process. Kleinberg considered power-law distributions, in which the
probability that a node $u$ chooses node $v$ as an LRC is proportional to
$\delta^{-\beta}$, where $\beta$ is a constant and $\delta$ is the distance
from $u$ to $v$. For $\beta = 2$ and $\ell = 1$, he showed greedy routing takes
$O(\log^2 n)$ time (this bound is tight [8]), whereas for $\beta \ne 2$ greedy
routing time is bounded below by a polynomial in $n$. In general, the optimal
value for $\beta$ is equal to the dimension of the lattice.

Subsequent work has instead considered a ring model. Barrière et al. [2]
showed that in this variation, $\beta = 1$ and $\ell = 1$ allows $O(\log^2 n)$
routing time. Aspnes, Diamadi, and Shah [1] generalized this to
$O(\log^2 n/\ell)$ as part of a proposed P2P network. Their system bears many
similarities to Chord [10], a system for maintaining a DHT which provides
$O(\log n)$ routing time using a ring-based overlay network with $\log_2 n$
LRCs per node.

In the context of both social and computer networks (particularly those de- 
signed with fault-tolerance in mind), it makes sense to consider graphs in which 
nodes have a random number of LRCs. Fraigniaud and Giakkoupis [4] stud- 
ied the effect of power-law LRC-degree distributions on the ring-based model. 
(We distinguish between LRC-degree distributions, which control the number
of LRCs assigned to a node, and LRC-distance distributions, which dictate how
those nodes are chosen.) In particular, they consider a family of zeta
distributions, modified to hold the mean at two regardless of the power-law
exponent. For directed graphs, greedy routing performs in $O(\log^2 n)$ time,
while for undirected graphs, routing time depends critically on the power-law
exponent.

Work on the corresponding lower bounds considers a broader class of graphs. 
In this model, each node is randomly assigned a set
$D \subseteq \{1, \dots, n\}$, which contains the distances to that node's
LRCs. (This allows random graphs unobtainable with independent LRC-degree and
LRC-distance distributions.) This process is uniform: the distribution used to
choose $D$ is the same for all nodes. Giakkoupis and Hadzilacos [5] gave an
$\Omega(\log^2 n/(\mathrm{E}[|D|]\, a^{\log^* n}))$ bound on the average
expected routing time (where $a > 1$ is a constant), which was later improved
to $\Omega(\log^2 n/\mathrm{E}[|D|])$ by Dietzfelbinger and Woelfel [3].

1.2 Statement of results 

The $\Omega(\log^2 n/\mathrm{E}[|D|])$ lower bound is tight in the sense that
the model under consideration permits distributions resulting in
$O(\log^2 n/\mathrm{E}[|D|])$ routing time, such as those studied by Aspnes
(if $\ell$ LRCs are chosen with replacement, then $\ell \ge \mathrm{E}[|D|]$).
However, establishing upper bounds for different distributions remains
an open problem. In this paper, we consider the ring model (with a harmonic
LRC-distance distribution). We show that if the LRC-degree distribution has
mean $\ell$ and the property that no node can have more than $O(\log n)$ LRCs,
then the expected routing time between any two nodes is $O(\log^2 n/\ell)$
(Theorem 1). Hence, this sub-family of graphs provides asymptotically optimal
routing time.

Finally, fixing the mean degree, we investigate what LRC-degree
distributions optimize greedy routing performance. We give Theorem 2, whose
lemmata establish that gaining contacts provides limited returns on the
expected length of each greedy hop. This holds for any LRC-distance
distribution under which closer nodes are more likely to be selected as LRCs
than those farther away. Thus, greedy routing in this family of directed
graphs is optimized when LRC-degrees do not vary.



2 Model Description

Let $\mathcal{R}_n = (V, E)$ be the directed ring graph with $n$ vertices,
which we identify with the integers:

$$V = \{0, \dots, n-1\}, \qquad E = \{(u, u+1) : u \in V\}.$$

All operations on vertices are performed modulo $n$.

Define the function $\delta : V \times V \to \mathbb{N}$ to be the distance
from $u$ to $v$ along the ring:

$$\delta(u, v) = \begin{cases} v - u & \text{if } v \ge u, \\ n - (u - v) & \text{if } v < u. \end{cases}$$

We wish to construct an augmented graph containing $\mathcal{R}_n$, but where
each node has some number of additional outgoing edges according to a
specified distribution. Let $p(n, \cdot)$ be a probability distribution on
$\mathbb{N}$ (that is, there is a different distribution for each value of
$n$). With each node $u$ of $\mathcal{R}_n$, associate a random variable $D_u$
taken from this distribution: $\Pr[D_u = k] = p(n, k)$. This variable
indicates how many additional edges will be attached to $u$ (since these
edges will be chosen with replacement, they will not in general be distinct).
From now on, we will write $p(n, k)$ as $p(k)$, with the dependence on $n$
made implicit.

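Given $u \in V$ and $j \in \mathbb{N}$, let
$\Delta_{u,j} \in \{1, \dots, n-1\}$ be a random variable such that

$$\Pr[\Delta_{u,j} = r] \propto \frac{1}{r}.$$

Note that the proportionality constant is the reciprocal of the $(n-1)$th
harmonic number:
$H_{n-1}^{-1} = \left(\sum_{r=1}^{n-1} \frac{1}{r}\right)^{-1} = \Theta(1/\log n)$.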

Define $E_u = \{(u, u + \Delta_{u,j}) : 1 \le j \le D_u\}$, and let
$E' = E \cup \bigcup_{u \in V} E_u$. The graph $\mathcal{H}_{n,p} = (V, E')$
so constructed is a harmonic ring. Given $u \in V$, let
$C_u = \{v \in V : (u, v) \in E_u\}$. Elements of $C_u$ are long-range
contacts (LRCs) of $u$.

For $u, v \in V$ and $A \subseteq V$, let $\Pr[u \to v]$ be the probability
that $(u, v) \in E_u$, and let $\Pr[u \to A]$ be the probability that there
exists a node $w \in A$ with $(u, w) \in E_u$.

We now introduce some notation to formalize the notion of greedy routing.
If $u$ and $v$ are nodes of $\mathcal{H}_{n,p}$, a greedy route from $u$ to
$v$ is a sequence $u = s_0, s_1, \dots, s_k = v$ such that
$(s_j, s_{j+1}) \in E'$ and if $(s_j, w) \in E'$, then
$\delta(s_{j+1}, v) \le \delta(w, v)$. Since $(s_j, s_j + 1) \in E'$, we can
always make progress towards $v$; a greedy route exists between arbitrary
vertices. Because $\delta(\cdot, v)$ is injective, the greedy route is unique.
The greedy routing time from $u$ to $v$ is $k$. This definition formalizes the
notion of always taking the route that looks best from a limited, local
perspective: each node "knows" (has links to) a limited number of other
nodes, and always passes a message along to the one closest to the
destination.

Finally, let $T_r$ denote the expected greedy routing time when the distance
between the source and destination nodes is $r$, and define
$\mathcal{T}_{n,p}$ to be the average expected routing time between all pairs
of nodes in $\mathcal{H}_{n,p}$:

$$\mathcal{T}_{n,p} = \frac{1}{n} \sum_{r=0}^{n-1} T_r.$$
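The construction above can be illustrated with a short simulation. The
following Python sketch is not part of the original analysis: the function
names, the `sample_degree` callback, and the seed are assumptions introduced
only for illustration. It samples each long-range contact at distance $r$ with
probability proportional to $1/r$ and then follows the greedy rule, using the
ring edge whenever no contact gets closer to the destination.

```python
import itertools
import random


def build_harmonic_ring(n, sample_degree, rng):
    """Return contacts[u]: the long-range contacts (LRCs) of every node u.

    sample_degree(rng) is a hypothetical callback that draws D_u.
    Each contact is placed r steps ahead with probability proportional to 1/r.
    """
    jumps = list(range(1, n))
    cum_weights = list(itertools.accumulate(1.0 / r for r in jumps))
    contacts = []
    for u in range(n):
        d = sample_degree(rng)
        chosen = rng.choices(jumps, cum_weights=cum_weights, k=d)
        contacts.append([(u + r) % n for r in chosen])
    return contacts


def ring_distance(u, v, n):
    """Distance from u to v travelling forward along the directed ring."""
    return (v - u) % n


def greedy_route_length(contacts, u, v, n):
    """Number of hops taken by greedy routing from u to v."""
    steps = 0
    while u != v:
        best = (u + 1) % n  # the ring edge is always available
        for w in contacts[u]:
            # A contact that overshoots v has a larger forward distance to v,
            # so it is never preferred over the ring edge.
            if ring_distance(w, v, n) < ring_distance(best, v, n):
                best = w
        u = best
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    n = 1024
    ring = build_harmonic_ring(n, lambda r: 2, rng)  # every node gets 2 LRCs
    print(greedy_route_length(ring, 0, n // 2, n))
```

Averaging `greedy_route_length` over many source-target pairs and many sampled
rings gives an empirical estimate of the average expected routing time.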

3 Routing complexity 

Our upper bound proof follows the same basic outline as Kleinberg's original 
argument: we first find a bound on the expected time it takes to cut an initial 
distance in half, and then couple this with the observation that this must be
done at most $\log_2 n$ times.



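Lemma 1. Let $\mathcal{H}_{n,p}$ be a harmonic ring. Let $u, v \in V$ be
distinct, and let $B = \{w \in V \mid \delta(w, v) \le \delta(u, v)/2\}$. Then
$\Pr[u \to B \mid D_u = 1] = \Theta(1/\log n)$.

Proof. Assume without loss of generality that $u = 0$. Then

$$\Pr[u \to B \mid D_u = 1] = \sum_{w \in B} \Pr[u \to w \mid D_u = 1]
= \sum_{w \in B} \frac{H_{n-1}^{-1}}{\delta(u, w)}
= H_{n-1}^{-1} \sum_{r = v/2}^{v} \frac{1}{r}.$$

Since $1/r$ is a decreasing function,

$$\int_{v/2}^{v} \frac{dr}{r} \le \sum_{r = v/2}^{v} \frac{1}{r} \le \frac{2}{v} + \int_{v/2}^{v} \frac{dr}{r}.$$

Both integrals equal $\log 2$ and $2/v \le 2$; that is,

$$H_{n-1}^{-1} \log 2 \le \Pr[u \to B \mid D_u = 1] \le H_{n-1}^{-1} (2 + \log 4).$$

Hence, $\Pr[u \to B \mid D_u = 1] = \Theta(H_{n-1}^{-1}) = \Theta(1/\log n)$. □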
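The two-sided bound in Lemma 1 can be checked numerically for a given ring
size. The Python sketch below is an illustration under our own naming
(`halving_probability` and `lemma_bounds_hold` are hypothetical helpers, not
from the original text). It takes $u = 0$, so that $B$ consists of the nodes
reached by jumps of length $\lceil \mathrm{dist}/2 \rceil, \dots, \mathrm{dist}$,
and compares the exact probability with the bounds $H_{n-1}^{-1} \log 2$ and
$H_{n-1}^{-1}(2 + \log 4)$.

```python
import math


def harmonic(k):
    """The k-th harmonic number H_k."""
    return sum(1.0 / r for r in range(1, k + 1))


def halving_probability(h, dist):
    """Pr[u -> B | D_u = 1] when delta(u, v) = dist and h = H_{n-1}."""
    first = (dist + 1) // 2  # shortest jump landing in B, i.e. ceil(dist / 2)
    return sum(1.0 / r for r in range(first, dist + 1)) / h


def lemma_bounds_hold(n):
    """Check the bounds of Lemma 1 for every distance 1 <= dist <= n - 1."""
    h = harmonic(n - 1)
    lower, upper = math.log(2), 2 + math.log(4)
    return all(
        lower <= h * halving_probability(h, dist) <= upper
        for dist in range(1, n)
    )


if __name__ == "__main__":
    print(lemma_bounds_hold(512))
```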

Lemma 1 makes it easy to work with the probability of cutting the remaining
distance in half. We will now take advantage of this to formulate and solve a 
recurrence describing how long greedy routing takes. 

Theorem 1. Let $\mathcal{H}_{n,p}$ be a harmonic ring. Let $X$ be a random
variable taken from the distribution $p(n, \cdot)$, and let $c > 0$ be a
constant such that for all $n$, $\Pr[X \le c \log n] > 0$. Then

$$\mathcal{T}_{n,p} = O\!\left(\frac{\log^2 n}{\mathrm{E}[X \mid X \le c \log n]}\right).$$

Proof. We will prove that this upper bound holds for the expected routing
time between arbitrary source-target pairs. Consider the greedy route from $u$
to $v$. How many steps does it take to cut the initial distance in half? We
found an answer to this question under the assumption that each node had
a single LRC, but now require a more general result. As before, define
$B = \{w \in V \mid \delta(w, v) \le \delta(u, v)/2\}$. The probability that
$u$ has an LRC in $B$ is the probability that not all of $u$'s contacts miss
$B$:

$$\Pr[u \to B] = \sum_{d=0}^{\infty} p(d) \left(1 - \Pr[u \not\to B \mid D_u = d]\right)
= 1 - \sum_{d=0}^{\infty} p(d) \left(1 - \Pr[u \to B \mid D_u = 1]\right)^d.$$

The probability that $u$ is linked to a node in $B$ is at least
$\Pr[u \to B]$, since the latter value does not account for the $(u, u+1)$
edge. Furthermore, the closer a message gets to $B$, the greater its chances
of entering $B$ on the next step; that is, $\delta(w, v) \le \delta(w', v)$
implies $\Pr[w \to B] \ge \Pr[w' \to B]$. This follows from the fact that
$\Pr[w \to v'] \ge \Pr[w' \to v']$ for all $v' \in B$. Therefore if $s_j$ is
on the greedy route from $u$ to $v$, $\Pr[s_j \to B] \ge \Pr[u \to B]$.

If $s_0, \dots, s_k$ is the greedy route from $u$ to $v$, let $M$ be the
random variable defined by $M = \min\{j : s_j \in B\}$. We have

$$\mathrm{E}[M] \le \frac{1}{\Pr[u \to B]}
= \frac{1}{1 - \sum_{d=0}^{\infty} p(d) \left(1 - \Pr[u \to B \mid D_u = 1]\right)^d}.$$



Since $\Pr[u \to B \mid D_u = 1] = \Theta(1/\log n)$, it follows that there
exists some positive constant $\beta$ such that for all sufficiently large
$n$, $\Pr[u \to B \mid D_u = 1] \ge \beta/\log n$. Let
$x = 1 - \beta/\log n$ (although $x$ depends on $n$ we will refrain from
adding a subscript, so as to avoid clutter). In other words, $x$ is an upper
bound for the probability that a given LRC fails to cut the remaining
distance in half. Hence, for large $n$,

$$\mathrm{E}[M] \le \frac{1}{1 - \sum_{d=0}^{\infty} p(d)\, x^d}.$$

Call this upper bound $\lambda$.



The value of $\lambda$ is independent of $u$ and $v$. Therefore $\lambda$ is
an upper bound for the expected time it takes to cut the remaining distance in
half between any two nodes in $\mathcal{H}_{n,p}$. Hence,

$$T_r \le \lambda + \max\{T_s : s \le r/2\}.$$

Since $T_0 = 0$, this yields, for $r \ge 1$:

$$T_r \le \lambda (1 + \log_2 r).$$

Therefore $\lambda (1 + \log_2 n)$ is an upper bound for the expected routing
time between any two vertices (and hence is an upper bound for the average
expected routing time over all pairs of vertices). Thus

$$\mathcal{T}_{n,p} = O\!\left(\frac{\log n}{1 - \sum_{d=0}^{\infty} p(d)\, x^d}\right). \tag{1}$$
Let $L = \lfloor c \log n \rfloor$, and define the probability distribution
$q$ by:

$$q(d) = \begin{cases} p(d)/P & \text{if } d \le L, \\ 0 & \text{otherwise,} \end{cases}$$

where $P = \Pr[X \le L]$. Let $Y$ be a random variable taken from the
distribution $q$. Then $\mathrm{E}[X \mid X \le c \log n] = \mathrm{E}[Y]$.
Define the function $\Lambda : \mathbb{N} \to \mathbb{R}$ by

$$\Lambda(n) = 1 - \sum_{d=0}^{\infty} p(d)\, x^d.$$

That is, $\Lambda(n)$ is the expression appearing in the denominator of (1).
It suffices to show that $\Lambda(n) = \Omega(\mathrm{E}[Y]/\log n)$.
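We have:

$$\Lambda(n) = \sum_{d=0}^{\infty} p(d) \left(1 - x^d\right)
= (1 - x) \sum_{d=0}^{\infty} p(d) \left(1 + x + \cdots + x^{d-1}\right).$$

Let $f : \mathbb{N} \to \mathbb{R}$ be the function
$f(d) = \sum_{i=0}^{d-1} x^i$. Then

$$\begin{aligned}
\sum_{d=L+1}^{\infty} p(d) f(d) &\ge f(L) \sum_{d=L+1}^{\infty} p(d) = f(L)(1 - P) = f(L)(1/P - 1) \sum_{d=0}^{L} p(d) \\
&\ge (1/P - 1) \sum_{d=0}^{L} p(d) f(d) = \sum_{d=0}^{L} \left(q(d) - p(d)\right) f(d) = \sum_{d=0}^{L} q(d) f(d) - \sum_{d=0}^{L} p(d) f(d).
\end{aligned}$$

Hence,

$$\sum_{d=0}^{\infty} p(d) f(d) \ge \sum_{d=0}^{L} q(d) f(d).$$

We know that $0 < x < 1$, so whenever $1 \le d \le L$,

$$f(d) = 1 + x + \cdots + x^{d-1} \ge d x^{d-1} \ge d x^L.$$

Returning to our expression for $\Lambda(n)$ and noting that
$x^L = (1 - \beta/\log n)^{\lfloor c \log n \rfloor}$ converges to a positive
constant as $n$ grows large,

$$\Lambda(n) \ge (1 - x)\, x^L \sum_{d=0}^{L} d\, q(d) = (1 - x)\, x^L\, \mathrm{E}[Y] = \Omega\!\left(\frac{\mathrm{E}[Y]}{\log n}\right).$$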

This concludes the proof. □ 
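The key inequality of the proof, $\Lambda(n) \ge (1 - x)\, x^L\, \mathrm{E}[Y]$,
holds for any $0 < x < 1$ and can be checked numerically. The Python sketch
below is only an illustration: the helper names, the heavy-tailed test
distribution, and the values chosen for $\beta$ and $c$ are assumptions (the
proof only asserts that a suitable constant $\beta$ exists). A degree
distribution is represented as a dictionary mapping $d$ to $p(d)$.

```python
import math


def zeta_like(max_degree, exponent):
    """A hypothetical heavy-tailed test distribution: p(d) ~ d^(-exponent)."""
    weights = {d: d ** -exponent for d in range(1, max_degree + 1)}
    z = sum(weights.values())
    return {d: w / z for d, w in weights.items()}


def lambda_and_bound(p, n, beta, c):
    """Return (Lambda(n), (1 - x) * x^L * E[Y]) for the distribution p.

    x = 1 - beta / log n and L = floor(c log n), as in the proof;
    Y is X conditioned on X <= L. Requires beta < log n and Pr[X <= L] > 0.
    """
    x = 1.0 - beta / math.log(n)
    cutoff = math.floor(c * math.log(n))
    lam = 1.0 - sum(pd * x ** d for d, pd in p.items())
    mass = sum(pd for d, pd in p.items() if d <= cutoff)
    mean_y = sum(d * pd for d, pd in p.items() if d <= cutoff) / mass
    return lam, (1.0 - x) * x ** cutoff * mean_y


if __name__ == "__main__":
    p = zeta_like(500, 2.0)
    for n in (10**3, 10**6, 10**9):
        print(n, lambda_and_bound(p, n, beta=0.5, c=2.0))
```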

If the maximum possible number of LRCs that can be assigned to a particular
node is $O(\log n)$, the result becomes much cleaner.

Corollary 1. Let $\mathcal{H}_{n,p}$ be a harmonic ring where $p(n, \cdot)$
has mean $\ell$. If there is some constant $c > 0$ such that $p(n, d) = 0$
whenever $d > c \log n$, then $\mathcal{T}_{n,p} = O(\log^2 n/\ell)$. □



This bound is tight [3].

4 Optimal LRC-degree distributions



The previous results demonstrate that the asymptotic performance of greedy
routing depends almost entirely on the mean of the distribution used to choose
the number of LRCs for each node. Experimentally, however, different
distributions can result in significantly different average routing times. In
this section, we prove that of those distributions with mean $\ell$, greedy
routing is optimized when every node has $\lfloor \ell \rfloor$ or
$\lceil \ell \rceil$ LRCs. This result holds not just for harmonic rings, but
in any variant where a closer node is more likely to be selected as an LRC
than one farther away.

Let $\Pr[j \succ i]$ be the probability that a node at distance $j$ from the
destination routes to a node at distance $i$ from the destination. Using this
notation,

$$T_r = 1 + \sum_{s=0}^{r-1} \Pr[r \succ s]\, T_s \qquad (r > 0).$$

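Lemma 2. $T_r$ is an increasing function of $r$.

Proof. Let $J_r$ be a random variable such that
$\Pr[J_r = s] = \Pr[r \succ r - s]$. For $r \ge 1$, define
$\tau_r = T_r - T_{r-1}$. Clearly $\tau_1 = T_1 - T_0 = 1$. Given
$1 < a < n$, assume that $\tau_i \ge 0$ whenever $i < a$. Then

$$\begin{aligned}
\tau_a &= \sum_{r=1}^{a-1} \Pr[a \succ r]\, T_r - \sum_{r=1}^{a-2} \Pr[a-1 \succ r]\, T_r \\
&= \sum_{r=1}^{a-1} \Pr[J_a = r]\, T_{a-r} - \sum_{r=1}^{a-2} \Pr[J_{a-1} = r]\, T_{(a-1)-r} \\
&= \sum_{r=1}^{a-1} \Pr[J_a \le r]\, \tau_{a-r} - \sum_{r=2}^{a-1} \Pr[J_{a-1} \le r-1]\, \tau_{a-r} \\
&\ge \sum_{r=2}^{a-1} \left(\Pr[J_a \le r] - \Pr[J_{a-1} \le r-1]\right) \tau_{a-r} \\
&\ge 0.
\end{aligned}$$

The first inequality drops the $r = 1$ term, which is nonnegative by the
induction hypothesis. The second results from the fact that closer nodes are
more likely to be chosen as LRCs than those farther away, which gives
$\Pr[J_a \le r] \ge \Pr[J_{a-1} \le r-1]$ (this is a sufficient condition for
the proof to work). The lemma follows by induction. □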

Lemma 3. Let $p$ be a distribution on $\mathbb{N}$ with mean $\mu$, and let
$f : \mathbb{R} \to \mathbb{R}$ be a twice-differentiable function with
$f(x) > 0$, $f'(x) \le 0$, and $f''(x) \ge 0$. Then, among all distributions
with mean $\mu$, $\sum_d p(d) f(d)$ is smallest when the support of $p$ is
$\{\lfloor \mu \rfloor, \lceil \mu \rceil\}$.

This lemma, the proof of which will be omitted, makes a simple statement
about optimizing the expected value of a function that provides diminishing
returns. When $a < \mu < b$, the benefit of increasing
$p(\lfloor \mu \rfloor)$ (while decreasing $p(a)$) is greater than the
corresponding cost of increasing $p(\lceil \mu \rceil)$ (while decreasing
$p(b)$).
When considering what benefit might be obtained for greedy routing by
varying the LRC-degree distribution, we find that expected route length is
governed by this lemma. That is, roughly speaking, a node gets diminishing
returns on the expected jump lengths it can provide with each additional LRC
it is allocated. The following theorem argues that since longer jumps are
always better (Lemma 2), the best thing to do is to ensure that LRC-degree
selection varies as little as possible (Lemma 3).
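The diminishing returns described above can be seen directly in the harmonic
case. The following Python sketch is an illustration under our own naming
(`expected_hop_length` is a hypothetical helper). For a node at distance $r$
from the destination with $d$ independent contacts, the greedy hop $J$ exceeds
$s$ unless every contact either jumps at most $s$ or overshoots the
destination, so $\mathrm{E}[J] = \sum_{s=0}^{r-1} \Pr[J > s]$ can be computed
exactly.

```python
def expected_hop_length(n, r, d):
    """Expected greedy hop length from distance r for a node with d LRCs.

    Assumes a harmonic ring on n nodes and 1 <= r <= n - 1.
    """
    h = [0.0] * n  # h[k] is the k-th harmonic number
    for k in range(1, n):
        h[k] = h[k - 1] + 1.0 / k
    total = 1.0  # Pr[J > 0] = 1, thanks to the ring edge
    for s in range(1, r):
        # Probability that a single contact does NOT land at a jump length
        # in (s, r], i.e. it fails to produce a hop longer than s.
        alpha = 1.0 - (h[r] - h[s]) / h[n - 1]
        total += 1.0 - alpha ** d
    return total


if __name__ == "__main__":
    n, r = 1024, 512
    hops = [expected_hop_length(n, r, d) for d in range(8)]
    gains = [b - a for a, b in zip(hops, hops[1:])]
    print(hops)
    print(gains)  # each extra contact helps, but less than the previous one
```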

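Theorem 2. Let $S_\ell$ be the set of probability distributions on
$\mathbb{N}$ with mean $\ell > 0$. Let $p \in S_\ell$ be the distribution with
support $\{\lfloor \ell \rfloor, \lceil \ell \rceil\}$. Then for all
$q \in S_\ell$, $\mathcal{T}_{n,p} \le \mathcal{T}_{n,q}$.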

Proof. Consider two arbitrary nodes, $u$ and $v$. We will show that the
expected routing time from $u$ to $v$ is smallest when $D_u$ is chosen
according to $p$, and that this is true regardless of what distribution is
used to choose $D_{u+1}, \dots, D_v$ (as long as the same distribution is used
for all of them).

So assume that $D_{u+1}, \dots, D_v$ are chosen from the same distribution
(keeping Lemma 2 applicable). As before, let $\tau_i = T_i - T_{i-1}$ (here we
will restrict the definitions of $\tau_i$ and $T_i$ to refer only to greedy
paths where the destination node is $v$); by Lemma 2, $\tau_i \ge 0$. Let
$\Delta$ be a random variable such that $\Pr[\Delta = r]$ is equal to the
probability that $u$ routes to $u + r$. Define $T_r(d)$ to be $T_r$ given
that the source node, $u$, has been assigned $d$ LRCs.

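$$\begin{aligned}
T_r(d) &= 1 + \sum_{s=1}^{r} \Pr[\Delta = s \mid D_u = d]\, T_{r-s} \\
&= 1 + \sum_{s=1}^{r} \Pr[\Delta \le s \mid D_u = d]\, \tau_{r-s} \\
&= 1 + \sum_{s=1}^{r} \Pr[\Delta \le s \mid D_u = 1]^d\, \tau_{r-s},
\end{aligned}$$

where we set $\tau_0 = 0$. This last equality allows us to extend the
definition of $T_r(d)$ to include all $d \in \mathbb{R}$. Letting
$\alpha_s = \Pr[\Delta \le s \mid D_u = 1]$, we have, for all $d$,

$$T_r'(d) = \sum_{s=1}^{r} (\log \alpha_s)\, \alpha_s^d\, \tau_{r-s} \le 0$$

and

$$T_r''(d) = \sum_{s=1}^{r} (\log^2 \alpha_s)\, \alpha_s^d\, \tau_{r-s} \ge 0.$$

By Lemma 3, $T_r = \sum_d q(d)\, T_r(d)$ is smallest when $q = p$. Hence using
$p$ for all nodes simultaneously minimizes routing times over all distances. □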
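For small rings, the conclusion of Theorem 2 can be checked exactly. On a
directed ring, greedy routing never revisits a node, so the recurrence
$T_r = 1 + \sum_s \Pr[J_r = s]\, T_{r-s}$ can be solved by dynamic
programming. The Python sketch below is an illustration under our own naming
(`expected_routing_times` and `average_routing_time` are hypothetical
helpers); it assumes the harmonic distance distribution and represents a
degree distribution as a dictionary mapping $d$ to its probability.

```python
def expected_routing_times(n, p):
    """Return [T_0, ..., T_{n-1}] for a harmonic ring with degree law p."""
    h = [0.0] * n  # h[k] is the k-th harmonic number
    for k in range(1, n):
        h[k] = h[k - 1] + 1.0 / k
    t = [0.0] * n
    for r in range(1, n):
        total, prev_cdf = 1.0, 0.0
        for s in range(1, r + 1):
            # One contact fails to produce a hop longer than s unless its
            # jump length lies in (s, r].
            alpha = 1.0 - (h[r] - h[s]) / h[n - 1]
            # Pr[J_r <= s], averaging over the number of contacts d.
            cdf = sum(pd * alpha ** d for d, pd in p.items())
            total += (cdf - prev_cdf) * t[r - s]
            prev_cdf = cdf
        t[r] = total
    return t


def average_routing_time(n, p):
    """Average of T_r over all distances r = 0, ..., n - 1."""
    return sum(expected_routing_times(n, p)) / n


if __name__ == "__main__":
    n = 256
    for name, law in [
        ("fixed degree 2", {2: 1.0}),
        ("degree 1 or 3", {1: 0.5, 3: 0.5}),
        ("degree 0 or 4", {0: 0.5, 4: 0.5}),
    ]:
        print(name, average_routing_time(n, law))
```

All three degree distributions have mean 2; the fixed-degree one should give
the smallest average, in line with the theorem.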